Q-learning with Nearest Neighbors
Authors
Abstract
We consider the problem of model-free reinforcement learning for infinite-horizon discounted Markov Decision Processes (MDPs) with a continuous state space and unknown transition kernels, when only a single sample path of the system is available. We focus on the classical approach of Q-learning, where the goal is to learn the optimal Q-function. We propose the Nearest Neighbor Q-Learning approach, which uses nearest neighbor regression to learn the Q-function. We provide a finite-sample analysis of the convergence rate of this method. In particular, we establish that the algorithm is guaranteed to output an ε-accurate estimate of the optimal Q-function with high probability using a number of observations that depends polynomially on ε and the model parameters. To establish our results, we develop a robust version of stochastic approximation results; this may be of interest in its own right.
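As a rough illustration of the idea in the abstract, the following Python sketch runs tabular-style Q-learning along a single sample path, where the Q-value of a continuous state is read from its nearest anchor point. It is not the algorithm analyzed in the paper: the fixed anchor grid, the 1/n step size, the epsilon-greedy exploration, and the toy dynamics are all assumptions made for this example, and the names `nearest_anchor`, `nn_q_learning`, and `toy_step` are hypothetical.

```python
import numpy as np


def nearest_anchor(anchors, state):
    """Index of the anchor point closest to a continuous 1-D state."""
    return int(np.argmin(np.abs(anchors - state)))


def nn_q_learning(step, n_actions, anchors, n_steps=20000,
                  gamma=0.9, explore=0.2, seed=0):
    """Simplified nearest-neighbor Q-learning on one sample path (assumed setup)."""
    rng = np.random.default_rng(seed)
    q = np.zeros((len(anchors), n_actions))   # Q-values stored at anchors only
    visits = np.zeros_like(q)                 # visit counts for 1/n step sizes
    state = rng.uniform(0.0, 1.0)
    for _ in range(n_steps):
        i = nearest_anchor(anchors, state)
        # Epsilon-greedy action selection (an assumption of this sketch).
        if rng.random() < explore:
            action = int(rng.integers(n_actions))
        else:
            action = int(np.argmax(q[i]))
        next_state, reward = step(state, action, rng)
        j = nearest_anchor(anchors, next_state)
        # Standard Q-learning update applied at the nearest anchor.
        visits[i, action] += 1
        alpha = 1.0 / visits[i, action]
        target = reward + gamma * np.max(q[j])
        q[i, action] += alpha * (target - q[i, action])
        state = next_state                    # continue the same sample path
    return q


def toy_step(state, action, rng):
    """Hypothetical dynamics on [0, 1]: action 0 drifts left, action 1 right."""
    drift = 0.1 if action == 1 else -0.1
    next_state = float(np.clip(state + drift + rng.normal(0.0, 0.02), 0.0, 1.0))
    reward = next_state                       # reward grows toward the right end
    return next_state, reward


anchors = np.linspace(0.0, 1.0, 11)           # assumed uniform anchor grid
q = nn_q_learning(toy_step, n_actions=2, anchors=anchors)
greedy = np.argmax(q, axis=1)                 # greedy action at each anchor
```

Because rewards in the toy problem lie in [0, 1] and the discount factor is 0.9, every learned Q-value stays between 0 and 1/(1 - 0.9) = 10. A finer anchor grid reduces the error from snapping states to anchors but requires more observations to visit every anchor, which is the kind of trade-off a finite-sample analysis has to quantify.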
Similar resources
A Hybrid Learning Strategy for Discovery of Policies of Action
This paper presents a novel hybrid learning method and performance evaluation methodology for adaptive autonomous agents. Measuring the performance of a learning agent is not a trivial task and generally requires long simulations as well as knowledge about the domain. A generic evaluation methodology has been developed to precisely evaluate the performance of policy estimation techniques. This ...
Scalable Secure Computation of Statistical Functions with Applications to k-Nearest Neighbors
Given a set S of n d-dimensional points, the k-nearest neighbors (KNN) problem is that of quickly finding the k points in S that are nearest to a query point q. The k-nearest neighbors problem has applications in machine learning for classification and regression, and also in searching. The secure version of KNN, where either q or S is encrypted, has applications such as providing services over ...
Development of a Smart Home Context-aware Application: A Machine Learning based Approach
Context-awareness is an important characteristic of a smart home. Several methods are used in context-aware applications to provide services. The main goal of a smart home is to predict the demands of its users and proactively provide the appropriate services by computing the user’s context information. In this paper, we present a context-aware application which can provide service according to predefined ...
A New Hybrid Approach of K-Nearest Neighbors Algorithm with Particle Swarm Optimization for E-Mail Spam Detection
Email is one of the fastest and most economical forms of communication. The growing number of email users has led to an increase in spam in recent years. Spam not only costs users money, time, and bandwidth, but has also become a risk to the efficiency, reliability, and security of a network. Spam developers are always trying to find ways to escape the existing filters; therefore new filters to de...
Nearest Neighbors Problem
DEFINITION Given a set of n points and a query point, q, the nearest-neighbor problem is concerned with finding the point closest to the query point. Figure 1 shows an example of the nearest neighbor problem. On the left side is a set of n = 10 points in a two-dimensional space with a query point, q. The right shows the problem solution, s. Figure 1: An example of a nearest-neighbor problem dom...
Journal title:
- CoRR
Volume abs/1802.03900
Publication date 2018